A randomized controlled trial of a proportionate universal parenting program delivery model (E-SEE Steps) to enhance child social-emotional wellbeing

Background Evidence for parenting programs to improve wellbeing in children under three is inconclusive. We investigated the fidelity, impact, and cost-effectiveness of two parenting programs delivered within a longitudinal proportionate delivery model (‘E-SEE Steps’).

Methods Eligible parents with a child ≤ 8 weeks were recruited into a parallel two-arm, assessor blinded, randomized controlled, community-based, trial with embedded economic and process evaluations. Post-baseline randomization applied a 5:1 (intervention-to-control) ratio, stratified by primary (child social-emotional wellbeing (ASQ:SE-2)) and key secondary (maternal depression (PHQ-9)) outcome scores, sex, and site. All intervention parents received the Incredible Years® Baby Book (IY-B), and were offered the targeted Infant (IY-I)/Toddler (IY-T) program if eligible, based on ASQ:SE-2/PHQ-9 scores. Control families received usual services. Fidelity data were analysed descriptively. Primary analysis applied intention to treat. Effectiveness analysis fitted a marginal model to outcome scores. Cost-effectiveness analysis involved Incremental Cost-Effectiveness Ratios (ICERs).

Results The target sample (N = 606) was not achieved; 341 mothers were randomized (285:56), 322 (94%) were retained to study end. Of those eligible for the IY-I (n = 101), and IY-T (n = 101) programs, 51 and 21 respectively, attended. Eight (of 14) groups met the 80% self-reported fidelity criteria. No significant differences between arms were found for adjusted mean difference scores; ASQ:SE-2 (3.02, 95% CI: -0.03, 6.08, p = 0.052), PHQ-9 (-0.61; 95% CI: -1.34, 0.12, p = 0.1). E-SEE Steps had higher costs, but improved mothers’ Health-related Quality of Life (0.031 Quality Adjusted Life Year (QALY) gain), ICER of £20,062 per QALY compared to control. Serious adverse events (n = 86) were unrelated to the intervention.

Conclusions E-SEE Steps was not effective, but was borderline cost-effective. The model was delivered with varying fidelity, with lower-than-expected IY-T uptake. Changes to delivery systems and the individual programs may be needed prior to future evaluation.

Trial registration International Standard Randomized Controlled Trial Number: ISRCTN11079129.

to the completion of a data access request form and, if approved, subject to a data sharing agreement, due to: a) Data containing potentially sensitive participant information such as mental health and domestic violence. b) Ethical concerns around using the data in a way that is not consistent with the PIS, e.g. for research that does not have ethical approval. Data requests will be reviewed by a data access committee, which will include members of the trial management team and independent members from ARC-YH Best Start Steering Committee. A data sharing agreement will be required to ensure data is used in accordance with the trial funder, and ethical guidelines.
Funding: The authors confirm the independence of researchers from funders and that all authors had full access to the study data (including statistical reports and tables) and take responsibility for the integrity of the data and the accuracy of the analysis. All authors were supported by the grant, National Institute for Health Research (NIHR) appropriate programs (infant, toddler, etc. up to age 12 years) which are well suited to a proportionate, longitudinal, universal delivery model.
The objectives of the study were to assess if 'E-SEE Steps' can:
• enhance child social-emotional wellbeing at 20 months of age when compared with services as usual,
• be delivered with fidelity as a proportionate, longitudinal, universal model,
• be cost-effective at 20 months when compared with services as usual.

Study design
The study involved a multi-center pragmatic parallel two-arm, assessor blinded, randomized controlled trial (RCT) with embedded process and economic evaluations, within community settings in England. Recruitment began in May 2017 and data were collected at home visits by trained data collectors at baseline, follow-up 1 (FU1) (2 months post-baseline), follow-up 2 (FU2) (9 months post-baseline) and follow-up 3 (FU3) (18 months post-baseline). Follow-ups were completed in March 2020. We evaluated the overall effect of IY (delivered in the context of E-SEE Steps) on child social-emotional wellbeing and parent depression at 20 months of age. A pilot study [11] (N = 205) informed the trial, leading to amendments (e.g. changes to sample size and random allocation ratio) which can be found in the full protocol (see https://www.dev.fundingawards.nihr.ac.uk/award/13/93/10) and published protocol [12]. We followed CONSORT (see S1 Table), CHEERS and TIDieR reporting guidance.

Participants and settings
Eligible parents with a child ≤ 8 weeks were recruited from community settings across four local authorities in England (two North, one Mid and one in the South). Parents had to be willing to be randomized, able to receive the intervention and to provide written informed consent. Parents were excluded if they were enrolled on another group-based program or had a child with obvious/diagnosed organic child developmental difficulties. Health visitors and family services asked eligible families if they would like information on the study. Those who agreed were contacted, with consent, by the research team. Researchers obtained written informed consent during home visits in accordance with ethical guidance and approvals. Parents could also self-refer and invite co-parents to participate. Families received a shopping voucher at each data collection point as a small 'thank you' for their contributions (increasing in £5 increments at each time-point, from £15-£30).
Other secondary outcome measures included The Parent Sense of Competence (PSOC) [17] which assesses parent satisfaction and efficacy, and the CARE Index (Infancy) [18], which is an observational measure of parent-child relationships. Both were administered at all timepoints. The Strengths and Difficulties Questionnaire (SDQ: 2-4-yr Version) [19] which assesses child behavior and emotions, and the Maternal Postnatal Attachment Scale (MPAS) [20] which assesses maternal bonding were administered at the final timepoint only.

Sample size calculation
Sample size was calculated using the ASQ:SE-2, and the values for key design parameters were informed by, and estimated from, the pilot study [11]; for further information see the published protocol [12] and full protocol (https://www.dev.fundingawards.nihr.ac.uk/award/13/93/10). The clinically important difference at FU3 (18 months post-baseline) was defined as 5 units of the ASQ:SE-2. We expected a consistent effect over the three follow-ups, with an assumed SD of 18 on the ASQ:SE-2 at FU3. The correlation between baseline and FU3 was 0.26, and between pairs of measurements after baseline it was 0.4. Due to the group-based nature of the intervention, a design effect of 1.25 was applied as an inflation factor for the intervention arm. We required a two-sided 5% significance level and 90% power. A 5:1 randomization ratio, intervention to control, was necessary to ensure that sufficient parents would meet the proportionate criteria to attend the parent programs, with a viable group size. A target of N = 606 allowed for 12% attrition; 441 intervention and 92 control parents needed to be retained.
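The recruitment target and the retained numbers quoted above can be cross-checked with simple arithmetic. The sketch below only reproduces the attrition allowance; it does not attempt the full repeated-measures power calculation, which is described in the protocol. The function name is illustrative and not taken from the study.

```python
import math

def retained_after_attrition(recruited, attrition):
    """Whole participants expected to remain after a given attrition proportion."""
    return math.floor(recruited * (1 - attrition))

target_recruited = 606          # recruitment target from the text
attrition = 0.12                # attrition allowance from the text
required_intervention = 441     # retained intervention parents required
required_control = 92           # retained control parents required

retained = retained_after_attrition(target_recruited, attrition)
required_total = required_intervention + required_control

print(f"expected retained: {retained}, required: {required_total}")
print(f"retained allocation ratio: {required_intervention / required_control:.2f}:1")
```

With these figures the expected retained sample (533) matches the 441 + 92 participants required at follow-up.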

Randomization and blinding
Randomization was conducted post-baseline by EpiGenysis at the University of Sheffield, using a web-based system with a 5:1 (intervention to control) ratio. Stratification variables included baseline PHQ-9, child ASQ:SE-2 scores, child and parent sex, and research site. All fieldworkers, referral agents, the chief investigator, statisticians (until final analysis), and the Trial Steering Committee, were blind to allocation. Participants, IY leaders, trial and data managers, and the process evaluation team, were not blind.
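The algorithm used by the web-based randomization system is not described here. As a minimal sketch of one common way to achieve a 5:1 ratio within strata, the example below uses permuted blocks of six (five intervention, one control) kept separately for each stratum. The block size, the class name and the way strata are encoded are all assumptions made for illustration.

```python
import random
from collections import defaultdict

# One block gives exactly five intervention and one control allocation (assumed block size).
BLOCK = ["intervention"] * 5 + ["control"]

class StratifiedBlockRandomizer:
    """Permuted-block allocation with an independent block sequence per stratum."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._pending = defaultdict(list)  # stratum -> allocations left in the current block

    def allocate(self, stratum):
        queue = self._pending[stratum]
        if not queue:                      # start a freshly shuffled block when one is used up
            block = BLOCK.copy()
            self._rng.shuffle(block)
            queue.extend(block)
        return queue.pop()

randomizer = StratifiedBlockRandomizer(seed=2017)
# Hypothetical stratum: (site, child sex, PHQ-9 band, ASQ:SE-2 band)
stratum = ("site_A", "female", "phq9_low", "asqse2_low")
allocations = [randomizer.allocate(stratum) for _ in range(12)]
print(allocations.count("intervention"), allocations.count("control"))
```

Because every completed block contains five intervention and one control allocation, the ratio holds within each stratum whenever a block is filled.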

Intervention
E-SEE Steps comprises two IY programs (IY-I and IY-T) delivered in a proportionate, longitudinal, universal intervention model with three steps: one universal, and two subsequent targeted/indicated steps, as the children age (S1 Fig). The IY-B was posted to all intervention parents to increase awareness of their babies' socioemotional needs. The IY-I and IY-T targeted group sessions were delivered weekly in collaborative two-hour sessions which include video clips of real-life situations and group discussions, plus exercises to practice at home. IY is underpinned by both social learning and attachment theory [21,22]. Program content is summarized in S2 Table. E-SEE Steps was delivered by Early Years Children's Services and/or Public Health Nursing staff, who were trained by accredited IY mentors (and supervised regularly) to deliver IY as part of the trial.
Parents were eligible for the IY-I or IY-T programs if they obtained 'mildly depressed' or higher scores on the PHQ-9, or if their child scored in the 'monitoring zone or above' on the ASQ:SE-2 (suggesting potential social-emotional issues) at follow-up 1 or 2. The research team contacted parents, if eligible for IY-I/T, and sites engaged with parents in relation to program attendance. There were four possible intervention 'doses' that the trial sample could have, dependent on their level of need: IY-B only; IY-B plus IY-I; IY-B plus IY-I and IY-T; or IY-B plus IY-T (for IY-I and IY-T logic models see http://www.incredibleyears.com/about/incredible-years-series/series-goals/). The control group/arm received services as usual (SAU) which included a range of supports, including behavior management, healthy weight/nutrition, early learning and development, and post-natal support. IY-B, IY-I and IY-T were not offered as SAU in trial delivery sites.
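The eligibility rule and the four possible doses can be written out as a short sketch. Two assumptions are made: the PHQ-9 threshold of 5 is the conventional lower bound of the 'mild' band and is not stated in the text, and the ASQ:SE-2 monitoring-zone cutoff varies by age interval, so it is passed in as a parameter (the value 25 used below is a placeholder, not a published cutoff). Function names are illustrative.

```python
PHQ9_MILD_CUTOFF = 5  # assumption: conventional start of the 'mild' PHQ-9 band

def eligible_for_targeted(phq9, asqse2, asqse2_monitoring_cutoff):
    """True if either the parent or the child score meets the proportionate criteria."""
    parent_flag = phq9 >= PHQ9_MILD_CUTOFF
    child_flag = asqse2 >= asqse2_monitoring_cutoff  # higher ASQ:SE-2 scores indicate more concern
    return parent_flag or child_flag

def dose(attended_iy_i, attended_iy_t):
    """Label one of the four possible intervention doses."""
    parts = ["IY-B"]  # universal step, sent to every intervention parent
    if attended_iy_i:
        parts.append("IY-I")
    if attended_iy_t:
        parts.append("IY-T")
    return " + ".join(parts)

PLACEHOLDER_CUTOFF = 25  # hypothetical monitoring-zone cutoff, for illustration only
print(eligible_for_targeted(phq9=6, asqse2=10, asqse2_monitoring_cutoff=PLACEHOLDER_CUTOFF))
print(dose(attended_iy_i=False, attended_iy_t=True))
```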

Process evaluation
Fidelity monitoring data included receipt of the IY-B, and IY-I and IY-T group attendance and parent satisfaction (using standard IY weekly feedback forms), leader self-rated adherence using IY weekly checklists, and researcher-rated implementation fidelity using the Parent Programme Implementation Checklist (PPIC) [23]. The PPIC measures adherence, quality of delivery and participant responsiveness. Barriers and facilitators to delivery, and stakeholder experiences, are reported separately [24].

Economic evaluation
The cost-effectiveness evaluation utilized data from an adapted Client Services Receipt Inventory (CSRI) [25] which assessed parent and child access to health, social and community services. The SDQ [19], which measures child behavior and emotions, the Pediatric Quality of Life Inventory (PEDsQL) [26], and the EQ-5D5L [27], which assesses the adult health dimensions of mobility, self-care, usual activities, pain/discomfort, and anxiety/depression, were used to calculate Quality Adjusted Life Years (QALYs) [28].

Ethical considerations
The study was approved by the National Health Service (NHS) North Wales Research Ethics Committee (REC) 5, Bangor on 22nd May 2015 (REC Reference: 15/WA/0178, IRAS 173946), and by Departmental Ethics Committee, University of York on 10th August 2015 (Reference: FC15/03). All participants provided written informed consent.

Analysis
For the effectiveness analysis, a marginal model was fitted to the ASQ:SE-2 scores of children when approximately 4, 11 and 20 months old (FUs 1-3), using generalized estimating equations (GEE) with a Gaussian family, identity link, robust standard errors and an autoregressive covariance structure of order 1, AR(1). STATA/MP 16.0 was used, with a two-sided test at the 5% level. Primary analysis applied intention to treat.
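To illustrate what an AR(1) structure implies for three repeated measurements, the sketch below builds the working correlation matrix in which the correlation between two occasions decays as a power of their separation. It treats the three follow-ups as equally spaced occasions, and the value of rho is purely illustrative; neither is taken from the fitted model.

```python
def ar1_correlation(n_times, rho):
    """AR(1) working correlation: corr(occasion i, occasion j) = rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(n_times)] for i in range(n_times)]

rho = 0.4  # illustrative value only, not an estimate from the trial
R = ar1_correlation(3, rho)  # rows and columns correspond to FU1, FU2, FU3
for row in R:
    print(["%.2f" % value for value in row])
```

Adjacent follow-ups share correlation rho, while FU1 and FU3 share the weaker correlation rho squared.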
Baseline prognostic factors, potential confounding factors, follow-up time and delivery site were included as covariates. Sensitivity analyses assessed the robustness of the primary analysis using the standard techniques in the RCT literature. For example, item non-response was imputed using questionnaire developer rules, and missing outcomes were explored by Multiple Imputation using Chained Equations (MICE) [29]. For further details see our Statistical Analysis Plan (SAP) at https://www.york.ac.uk/media/healthsciences/documents/research/public-health/e-see/1_Statistical%20Analysis%20Plan%20(main%20trial).pdf. Prior to database close and un-blinding, four changes were agreed and made to the analysis model.
1. An original multilevel mixed model with treatment group and participants as random effects was replaced with a marginal model fitted using GEE. We no longer accounted for treatment group clustering because the offer of IY-I and IY-T was conditional on FU1 and FU2 outcomes, so clustering was confounded with treatment effect. We used a marginal model because accounting for repeated measures using a mixed model inflates the Type 1 error or gives a biased estimate of the treatment effect. Simulations conducted during SAP development suggested estimates from this alternative model were robust to Intra-Cluster Correlations (ICCs) below 0.2.
2. Cluster-level analysis using summary measures is no longer included because participants can get IY-I alone, IY-T alone or both, so there is no way of grouping participants into clusters that remain stable throughout the intervention.
3. The sex of primary caregiver covariate is not used because the pilot included no male primary caregivers, so the associated model parameter could not be estimated.

4. Per protocol and Complier Average Causal Effect (CACE) analyses were not conducted, as there is no satisfactory way of defining compliers without biasing the estimated impact of IY-I and IY-T on compliers. This is due to the conditional design whereby eligible participants have already scored highly on the outcome measure. Descriptive analysis of the characteristics associated with compliance was undertaken.
Fidelity monitoring data were analyzed descriptively using means/medians and percentages.
Cost-effectiveness was assessed using incremental cost per QALY [28] gained of E-SEE Steps compared with SAU. Probabilistic sensitivity analyses were used to estimate the uncertainty around the adoption decision. Sensitivity analyses determined the robustness of the results to altering leading assumptions (see S1 Text).
Costs were estimated from a public sector perspective and calculated by applying published national (UK) cost estimates to relevant resource use. Costs and effects were discounted at 3.5% per annum as per national guidance from the National Institute of Health and Care Excellence (NICE) [30]. Outcomes were assessed in terms of QALYs [28], using SDQ [19] mapped to PEDsQL for children [26], and EQ-5D5L for adults [27].
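Discounting at 3.5% per annum can be sketched as below. The convention that costs and effects falling in the first year are left undiscounted, and the example amounts, are assumptions for illustration; how the trial assigned values to years is not stated here.

```python
def discount(value, years_from_start, rate=0.035):
    """Present value of a cost or effect occurring `years_from_start` years after baseline."""
    return value / (1 + rate) ** years_from_start

# Hypothetical amounts, for illustration only.
first_year_cost = discount(100.0, 0)   # first year: no discounting under the assumed convention
second_year_cost = discount(100.0, 1)  # second year: divided by 1.035

print(f"{first_year_cost:.2f} {second_year_cost:.2f}")
```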

Results
A total of 341 eligible mothers (from a potential 493) were randomized (see Fig 1) and their data analyzed; 322 (94%) were retained at trial end (6 withdrew, and 1 was withdrawn by the CI). The target sample size of 606 was not achieved. Mothers' mean age was 30.9 (5.0) years, mean child age was 6 (2.1) weeks (see Table 1). No major imbalances between arms at baseline existed in terms of covariates and baseline outcome scores.

Process evaluation-Fidelity, intervention take-up and delivery
The (universal) IY-B was posted to all intervention families. Fifty-one of 101 eligible at FU1 received at least one session of IY-I, and 21 of 101 eligible at FU2 received at least one IY-T session (see Fig 1 and S3 Table). We expected uptake to be 50 and 48 respectively, showing a lower-than-expected IY-T uptake. Attendance levels reduced over time; 80% attended the fifth session, which reduced to 45% for IY-I at session 8 (of 9) and 43% at session 10 (of 12) for IY-T. Average individual session attendance was 6.5 and 6.4 for IY-I and IY-T respectively. Parents who attended at least one IY-I/IY-T session and parents who were invited but did not attend did not differ on outcomes, although better educated parents in higher income bands were marginally more likely to take up the intervention; numbers were small and therefore limit any definitive conclusions, see S4 Table. In addition, we compared participants eligible for IY-I and IY-T with subgroups of control participants with eligible ASQ:SE-2 and PHQ-9 scores (pseudo controls), using the same model as for the primary outcome. No differences were found between arms (see S5 Table). Parents with lower depression (PHQ-9) scores were more likely to attend at least one IY-I session. There was no difference in attendance by ASQ:SE-2 scores.

PLOS ONE
Weekly parent satisfaction with content and process was high, averaging 3.4 and 3.7 (out of 4), for IY-I and IY-T respectively.
Six (of eight) IY-I and two (of six) IY-T groups met the trial-set criteria of 80% on self-reported fidelity (adherence). The independent PPIC observation [23] yielded average fidelity rates across quality, adherence, and responsiveness of 64% and 74% for IY-I and IY-T respectively (S2 Fig).

Effectiveness evaluation
The findings show that E-SEE Steps was not effective in enhancing child social-emotional wellbeing compared to the control arm. Primary analyses found a borderline statistically significant difference in favor of the control arm (3.02, 95% CI: -0.03, 6.08, p = 0.052) (Table 2). On average, ASQ:SE-2 scores tended to be 3 units higher over the three FUs in the E-SEE Steps arm when compared to controls. Unplanned sensitivity analyses were performed due to skewed data and the implication of this with a large arm size imbalance, due to the randomization ratio of 5:1. Sensitivity analysis provided no evidence that the Minimal Clinically Important Difference (MCID) was reached, see S6 Table. The difference between groups was reduced for the ASQ:SE-2, but this did not change the primary analysis finding (2.56, 95% CI: -0.69, 5.80, p = 0.122). The results did not differ depending on parent education, child sex or if their child was first-born.
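As a rough consistency check, the reported p-values can be approximately recovered from the point estimates and 95% confidence intervals. The sketch assumes a symmetric Wald interval and a normal reference distribution, which is an assumption about how the intervals were formed, so agreement is only expected to within rounding. The function name is illustrative.

```python
import math

def p_from_ci(estimate, lower, upper, z_crit=1.959964):
    """Two-sided p-value implied by an estimate and its 95% Wald confidence interval."""
    standard_error = (upper - lower) / (2 * z_crit)
    z = estimate / standard_error
    return math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))

p_primary = p_from_ci(3.02, -0.03, 6.08)      # primary ASQ:SE-2 analysis
p_sensitivity = p_from_ci(2.56, -0.69, 5.80)  # sensitivity analysis

print(f"primary: {p_primary:.3f}, sensitivity: {p_sensitivity:.3f}")
```

Both implied values sit close to the reported p = 0.052 and p = 0.122, with small discrepancies attributable to rounding of the published estimates and interval limits.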
Primary analysis found no significant differences between arms for the key secondary outcome of parent depression (Table 2; adjusted mean difference = -0.61; 95% CI: -1.34, 0.12; p = 0.1). Sensitivity analysis increased the difference between groups (-0.64; CI: -1.35, 0.07; p = 0.077), but did not alter the primary analysis findings, see S6 Table. Other secondary outcomes showed no arm differences on any measures, including for how children were fed (e.g. breast, bottle, mixed); see S7 Table.

Economic evaluation
E-SEE Steps had higher costs (£2,610 vs £1,989) and QALYs (2.618 vs 2.587) compared to SAU over the trial period, resulting in an ICER of £20,062 per QALY compared to services as usual (see Table 3).
The small gain in mean QALYs in adults outweighed minor decrements reported in child outcomes over the trial period. All scenarios found E-SEE Steps cost-effective at the maximum recommended threshold of £30,000 per QALY, see S8 Table. The probability of E-SEE Steps being cost-effective was estimated at 36%, 49% and 67% for £15,000, £20,000, and £30,000 cost-effectiveness thresholds, respectively.
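The ICER and the threshold comparison can be sketched from the rounded mean costs and QALYs reported above. Because the inputs are rounded, the result (about £20,032 per QALY) differs slightly from the published £20,062, which was presumably computed from unrounded values. The net monetary benefit function is a standard re-expression of the same comparison and is included here as an illustration, not as the trial's stated method.

```python
def icer(cost_new, cost_old, qaly_new, qaly_old):
    """Incremental cost-effectiveness ratio: extra cost per extra QALY."""
    return (cost_new - cost_old) / (qaly_new - qaly_old)

def net_monetary_benefit(threshold, delta_cost, delta_qaly):
    """Positive when the intervention is cost-effective at the given threshold."""
    return threshold * delta_qaly - delta_cost

delta_cost = 2610 - 1989    # rounded mean costs (GBP), E-SEE Steps minus SAU
delta_qaly = 2.618 - 2.587  # rounded mean QALYs, E-SEE Steps minus SAU

ratio = icer(2610, 1989, 2.618, 2.587)
print(f"ICER from rounded inputs: {ratio:,.0f} GBP per QALY")
for threshold in (15_000, 20_000, 30_000):
    nmb = net_monetary_benefit(threshold, delta_cost, delta_qaly)
    print(f"threshold {threshold:,}: NMB = {nmb:,.0f} GBP")
```

The net monetary benefit is negative at £15,000, close to zero at £20,000 and positive at £30,000, which is in line with the reported probabilities of cost-effectiveness rising from 36% to 67% across those thresholds.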
Post-randomization adverse events (AEs) were recorded (serious = 86; other = 96) and included injuries or conditions arising from childbirth, and common infant ailments such as bronchitis; all were unrelated to the intervention and there were no differences between arms regarding their proportion or nature.

Discussion
The findings show no positive effect for E-SEE Steps on child social-emotional wellbeing at 20 months when compared to the control arm. ASQ:SE-2 scores declined (worsened) for both arms, but the intervention arm declined more. No significant effect was found for the key secondary outcome, parental depression; sensitivity analyses strengthened the signal in favor of the intervention, but it was not significant. No statistically significant effects were found for any secondary outcomes. Parent take-up of IY-T was lower than expected, and fidelity of delivery for IY-I and IY-T was mixed, both of which may have influenced the findings. The cost-effectiveness of E-SEE Steps was contingent on relatively modest differentials in parental health-related quality of life (HRQoL) that were short in duration, partially offset by reductions in child HRQoL.
This study is the first in the UK to explore the use of a proportionate, longitudinal, universal delivery model with a specific parent intervention. This trial showed no evidence of effectiveness for E-SEE Steps overall, and it was not possible to assess the IY-I or IY-T programs for effectiveness as 'stand-alone' interventions in this model. Other RCTs of stand-alone interventions to support child outcomes in the very early years have also failed to find an effect, e.g. the study of the Family Nurse Partnership trial [31]; however, this focused on parent outcomes during and after pregnancy (which is typically a pre-requisite for child outcome changes). Triple-P Baby is a program from a suite of Triple-P programs (https://www.triplep.net/glo-en/home/), as is Mellow Bumps (https://www.mellowparenting.org/). These programs are for parents during/after pregnancy and are currently undergoing trial. Although no effectiveness results have been published for these parent and baby programs, the Mellow Bumps trial 'THRIVE' process evaluation findings suggest that vulnerable families did not benefit and felt marginalized, and that more is needed to support such families in attending the parent programs for the full duration [32]. However, a controlled trial in Ireland investigated IY-I as part of a wraparound service (called 'Up2Two') and found parenting efficacy and child cognitive stimulation effects [33]. Overall, more work is needed to identify effective parenting interventions for families with infants/toddlers.
This study had several key strengths. The proportionate universal trial design reflected real-world services and addressed different familial levels of need. E-SEE Steps combined universal preventative and early intervention/treatment elements. Low levels of missing data and high participant retention (94%) somewhat mitigated the shortfall in recruitment, retaining sufficient power to address the main research question. An independent observational outcome measure was used, in addition to parent report, and a robust measure selection strategy was undertaken.
However, the study was not powered to establish the effectiveness of each of the individual three E-SEE Steps (or four possible doses) with the sample. Low IY group numbers and attendance rates (and small control n) meant that planned secondary analysis to explore each level of intervention could not be conducted. Sample representativeness is also questionable; 45% of mothers had an undergraduate degree or higher (lower than the national average of 57%, see ONS data), and 11% of parents were single/not in a "live-in" relationship (lower than the national average of 23-25%, according to 2019 Gingerbread and Office for National Statistics data; see Table 1 in Families and households).
Despite careful measure selection, caution is needed in interpretation; the ASQ:SE-2 (which is routinely used in the UK for 24-month child developmental assessments) and the observational Infant Care Index are not validated in the UK. The SDQ (2-4-yr Version) [19] is not validated for the trial age-group (20-months-old), but we used the youngest age SDQ version available; Infant Care Index analysis was conducted on a subset with complete data at all timepoints. The lack of appropriate and robust measures across infancy and toddlerhood [13] highlights an important need for more psychometric studies in this area. The confidence intervals in the sensitivity analysis included an MCID of +5 on the ASQ:SE-2, which could be considered the opposite of a positive clinical effect (which we defined as -5), although we have insufficient information on which to base this claim. Long-term outcomes could not be measured within the trial period.
The low conversion rates from eligibility to accessing at least one session (IY-I = 50%, IY-T = 21%) could suggest difficulties in engaging parents, or that the program was not attractive to parents, and/or they were too overwhelmed to participate. We found that parents with higher levels of depressive symptoms were less likely to attend IY-I. Mental health provision during pregnancy and the perinatal period in the UK remains limited, despite NICE guidance [34] and the potential negative impact on children. It is possible that more engagement work with families is needed to encourage take-up, or to offer families alternative supports as appropriate. The lower take-up of IY-T could also reflect a return to work; greater flexibility in the timing of group delivery may therefore be needed.
Less than half of parents who attended the targeted programs completed them, although 80% were still attending at week 5, suggesting that parents may prefer/can commit to shorter programs. Low uptake and retention rates likely impacted the findings and this, combined with varying levels of fidelity, suggests that system and possibly program changes may be needed (Berry et al., submitted). A pre-intervention component to identify, engage and retain parents, and those with low mood, may help to reduce attendance barriers [35]. Given the uncertainty around long-term parental and child outcomes, the cost-effectiveness of E-SEE Steps remains equivocal.
Although IY-I and IY-T could not be individually assessed for effectiveness in this study, IY-B will be explored by combining pilot [11] and main trial data. We expected similar (not different) trajectories for the primary and key secondary outcomes given the relationship between parent mental health and child social emotional wellbeing. A longer-term follow-up could explore whether E-SEE works preventatively or not, i.e. intervention family outcomes are sustained but control families "worsen" in comparison.

Conclusions
E-SEE Steps, a proportionate universal (stepped) delivery model of a program for parents of infants and toddlers was challenging to implement, had lower than expected parental uptake for IY-T, and was not effective in enhancing child social emotional wellbeing or reducing parent depression.
E-SEE Steps was borderline cost-effective over the period of the trial, but cost-effectiveness over the longer term will depend on the persistence of modest effects on parent mental health.
Collectively, the findings suggest that the current model cannot yet be recommended for use. Changes to the delivery systems, and to the individual programs within the model, may be needed prior to any future trials of this model.
The evidence gap for parent programs for children under age two to enhance child social emotional wellbeing remains, and further research is needed to establish the most appropriate means to support early child wellbeing in a preventive and indicated way.